On Evaluation of Automatically Generated Clinical Discharge Summaries
Authors
Abstract
Proper evaluation is crucial for developing high-quality computerized text summarization systems. In the clinical domain, the specialized information needs of clinicians complicate the task of evaluating automatically produced clinical text summaries. In this paper we present and compare the results from both manual and automatic evaluation of computer-generated summaries. These summaries are composed of sentence extracts from the free text in clinical daily notes, written by physicians concerning patient care and corresponding to individual care episodes. The primary purpose of this study is to find out whether the automatic evaluation correlates with the manual evaluation, and we analyze which of the automatic evaluation metrics correlates most strongly with the manual evaluation scores. The manual evaluation is performed by domain experts who follow an evaluation tool that we developed as part of this study. We hope thereby to gain insight into the reliability of the selected approach to automatic evaluation, so that we can further develop the underlying summarization system. The evaluation results seem promising in that the ranking order of the various summarization methods, as ranked by all the automatic evaluation metrics, corresponds well with that of the manual evaluation. These preliminary results also indicate that the automatic evaluation setup can be used as an automated and reliable way to rank clinical summarization methods internally in terms of their performance.
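The comparison described in the abstract, checking how closely an automatic metric's ranking of summarization methods agrees with the manual ranking, can be illustrated with a rank correlation. The sketch below uses Spearman's coefficient as one common choice; the abstract does not say which statistic the authors used, and the function names, metric names, and all scores are hypothetical placeholders, not results from the paper.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical scores for four summarization methods (illustration only).
manual_scores = [3.1, 2.4, 4.0, 1.5]        # assumed expert ratings
metric_a_scores = [0.42, 0.35, 0.51, 0.20]  # assumed automatic metric A
metric_b_scores = [0.30, 0.45, 0.50, 0.10]  # assumed automatic metric B

for name, scores in [("metric_a", metric_a_scores), ("metric_b", metric_b_scores)]:
    print(name, round(spearman(manual_scores, scores), 3))
```

With these made-up numbers, metric A orders the methods exactly as the manual scores do, while metric B swaps two of them and so receives a lower coefficient. A real analysis would also need to account for the small number of methods being ranked.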
Similar resources
Enhancing Patient Readability of Discharge Summaries with Automatically Generated Hyperlinks
Patients often have difficulty in understanding medical concepts and vocabulary in their Discharge Summaries. We explore automatic hyper-linking to online resources for difficult terms as a means of making the content more comprehensible for patients. We use the Consumer Health Vocabulary (CHV) as a resource for scoring the difficulty of terms and to provide the most consumer-friendly synonyms. ...
Dictated versus database-generated discharge summaries: a randomized clinical trial.
BACKGROUND Hospital discharge summaries communicate information necessary for continuing patient care. They are most commonly generated by voice dictation and are often of poor quality. The objective of this study was to compare discharge summaries created by voice dictation with those generated from a clinical database. METHODS A randomized clinical trial was performed in which discharge sum...
Evaluation of computer generated neonatal discharge summaries.
Computer generated and dictated discharge summaries were compared for all 133 babies admitted for intensive and special care during a six month period. Whereas 130/133 (98%) had a computer generated summary, only 94/133 (71%) had a dictated summary. In addition, computerised summaries were completed at discharge, but there was a delay up to 26 weeks for dictated summaries. Dictated summaries ha...
Research Paper: The Evaluation of a Temporal Reasoning System in Processing Clinical Discharge Summaries
CONTEXT TimeText is a temporal reasoning system designed to represent, extract, and reason about temporal information in clinical text. OBJECTIVE To measure the accuracy of the TimeText for processing clinical discharge summaries. DESIGN Six physicians with biomedical informatics training served as domain experts. Twenty discharge summaries were randomly selected for the evaluation. For eac...
Thomson Reuters at TAC 2009: ContextChain and Fractional Conditional Compressibility of Models
This paper contains the results for the FastSum system and a simple baseline system for the TAC 2009 main task, update summarization. For the pilot task of Automatically Evaluating Summaries of Peers (AESOP), we present two novel metrics. The first metric, called ContextChain, is an extension of a recently proposed metric, AutoSummENG, that is based on comparing n-gram graphs of the model summar...
Publication date: 2014